A critical problem in spectrum sensing is the design of the detection algorithm and 
its test statistic. Existing approaches rely on the energy level of each 
channel of interest. However, this feature alone cannot accurately characterize the 
actual use of public amateur radio, because the transmitted signal is not 
continuous and may consist only of a carrier frequency without information. 
This paper proposes a novel energy detection and waveform feature 
classification (EDWC) algorithm to detect speech signals in public 
frequency bands based on energy detection and supervised machine learning. 
The energy level, descriptive statistics, and spectral measurements of radio 
channels are treated as feature vectors and classifiers to determine whether 
the signal is speech or noise. The algorithm is validated using actual 
frequency modulation (FM) broadcasting and public amateur signals. The 
proposed EDWC algorithm's performance is evaluated in terms of training 
duration, classification time, and receiver operating characteristic. The 
simulation and experimental outcomes show that the EDWC can distinguish 
and classify waveform characteristics for spectrum sensing purposes, 
particularly for the public amateur use case. The results show that public 
radio frequency signals can be detected and classified as either voice signals 
for speech communication or noise, which is essential for security 
applications. 

1. INTRODUCTION 

The radio spectrum is the radio frequency (RF) part of the electromagnetic spectrum, which 
is considered a limited resource. With the advancement of communication technology, government agencies 
must supervise the management of the frequency band following rules to avoid mutual interference. 
Therefore, monitoring spectrum usage and recording usage statistics are essential for the development, 
improvement and issuance of regulations under actual use conditions, particularly regarding the available 
frequencies of public amateur radio. The technology that can be used to support this activity is cognitive 
radio (CR), which has been used extensively in solving the problem of frequency density, as demonstrated in 
[1], [2]. 

Due to the increasing demand for radio frequency communication, it is very challenging to exploit 
these limited or underutilized spectral resources by using CR technology, as presented by [3]. One of the 
essential elements of CR theory is the ability to measure, understand, determine and be informed of the 
parameters related to radio channel properties, as shown by [4], [5]. The main features of CR are spectrum 
sensing, spectrum decision, spectrum sharing, and spectrum mobility, as shown by [6], [7]. Spectrum 
sensing is the task of obtaining knowledge about spectrum usage and the presence of users in a 
geographical area. As demonstrated by [8], [9], the basic spectrum sensing techniques are energy detection 
(ED), matched filter detection, cyclostationary detection, and certain other detection techniques, each of 
which has operational specifications, benefits and limitations. ED is a successful and uncomplicated 
technique that is particularly suited to a random signal, and it will be considered in this paper. 

ED is one of the simplest methods of detection technology because the CR receiver does not require 
any information about the samples received previously. Notably, its purpose is to process the received samples to 
estimate the energy level in the channel. As demonstrated by [10], the authors proposed a method to use ED 
after optimally combining the signal samples received in space and time based on the principle of maximizing 
the signal-to-noise ratio (SNR). The determination of the threshold is the critical parameter in the classical 
energy detector. It must be optimized for each detection technique to improve its performance, as demonstrated 
by [11]-[13]. In a wide-band spectrum sensing scenario, a subband ED method can perform effectively under noise 
uncertainty and frequency-selective channels, and filter bank spectrum sensing has also been implemented, as shown by 
[14], [15], respectively. In all of these variants, the fundamental principle of ED is to compare the signal energy to a sensing 
threshold in a given bandwidth within a specific sensing period, as demonstrated by [16]. 

Many researchers have focused on simulating and making real-time measurements for a wide range 
of environments and conditions. Koley et al. [17] and Varma and Mitra [18] used an NI-USRP, which interfaced 
with a system through LabVIEW software to act as an RF transceiver. A wireless open-access research 
platform (WARP) board was used for real-time ED, as demonstrated by [19], [20]. Moreover, the 
RFeye sensing node was used to record signals for radio spectrum monitoring purposes, as shown by [21]. 
Another interesting issue, as presented by [22], is the case in which the transmitter switches from active to 
inactive at random time intervals. This paper uses a ZedBoard combined with the Analog Devices 
AD-FMCOMMS3 module as the CR receiver in the experimental setup. The modules are controlled and 
the data are processed with a program developed in MATLAB. 

It is now widely accepted that artificial intelligence technology performs essential functions in every 
field; for example, there is a machine learning approach to ranging error mitigation for localization 
algorithms, as shown by [23]. Numerous machine learning techniques, including both supervised and 
unsupervised algorithms, have been applied in spectrum sensing 
applications, as demonstrated by [24]-[27]. In addition, detection and classification based on waveform 
characteristics have been investigated in numerous areas, such as seismic signals, electrocardiogram signals 
and multiplexing signals, as shown by [28]-[30]. The combination of machine learning and 
waveform characteristic analysis can be used to design novel models that operate more efficiently for spectrum 
sensing purposes. 

In actual use, a particular frequency spectrum has diverse characteristics and applications. The 
Office of National Broadcasting and Telecommunications Commission, Thailand, has determined the control 
of the frequency band in the National Table of Frequency Allocation, as shown by [31], by specifying the use 
of the frequency range 134-174 MHz for amateur public radio. The number of amateur radio users in 
Thailand is continuously increasing. However, there is still a lack of statistics on usage, including the 
disturbance of the frequency spectrum in the amateur radio band, which is very important for the agencies 
responsible for governing the allocation of spectrum resources. 

Motivated by the above challenges, this paper proposes an energy detection and waveform feature 
classification (EDWC) algorithm for amateur public radio based on ED techniques and waveform 
characteristics that use machine learning algorithms. The only prior information required is the bandwidth of 
each channel $B$. The proposed EDWC algorithm consists of two processes: ED and waveform 
classification. The waveform classification process includes two steps: i) the training phase and ii) the 
identification of clusters as sound or noise signals. To the best of the author's knowledge, detection and 
machine learning techniques have not been adopted for spectrum sensing in the amateur frequency band in 
the existing literature. The main contributions of this paper are summarized below. 

— In contrast to the existing methods, this paper introduces a developed detection and classification 
framework, which combines the performance of ED and demodulated waveform classification for test 
statistic design and utilizes a threshold and waveform feature-based mechanism for real-time detection. 

— Under the EDWC framework, this paper proposes supervised learning approaches such as the 
classification tree (CTR), discriminant analysis (DCA), naive Bayes classifier (NBC), $k$-nearest 
neighbours (KNN), and support vector machine (SVM) algorithms. 

— This paper conducts extensive experiments using real captured samples. The results verify the efficiency 
of the proposed algorithm in terms of its detection performance and scalability. The performance of each 
classification technique is evaluated in terms of the training time and the receiver operating characteristic 
(ROC) curve. 

The rest of this paper is organized as follows: the system model is presented in section 2. The 
EDWC algorithm framework is proposed in section 3. The experimental results and discussion are presented 
in section 4. Finally, conclusions are drawn in section 5. 


2. SYSTEM MODEL 

The problem of spectrum sensing is to determine whether a particular part of the spectrum is 
accessible or not. Therefore, we can express the spectrum sensing problem as a binary hypothesis testing 
problem at the discrete-time instant t: 


$$H_0 : y(t) = n(t), \tag{1}$$
$$H_1 : y(t) = s(t) + n(t), \tag{2}$$


where hypotheses $H_0$ and $H_1$ indicate the absence and presence of the primary signal, respectively, $y(t)$ 
refers to the signal received at the location of the CR system, $n(t)$ is additive complex white Gaussian noise 
with zero mean and $s(t)$ represents a signal transmitted by the primary node. 


2.1. Energy detection 

The energy detector contributes to energy evaluations corresponding to the above binary hypothesis. 
Let $y(n)$ be the $n$-th ($n = 1, 2, \ldots, N$) sample of $y(t)$. All the samples are placed into the vector 
$\mathbf{y} = [y(1), y(2), \ldots, y(N)]^T$. Typically, the decision statistic $T(\mathbf{y})$ based on $N$ received samples can be given 
by (3): 


$$T(\mathbf{y}) = \sum_{n=1}^{N} |y(n)|^2 \underset{H_0}{\overset{H_1}{\gtrless}} \lambda, \tag{3}$$


where $\lambda$ is a predefined decision threshold. The reliability of the decision rule in (3) can be 
characterized by the probability of detection $P_d$ and the probability of false alarm $P_f$. The former is the 
probability of detecting the primary signal when it is present in the frequency band and can be formulated 
mathematically as (4). 


$$P_d = \Pr(T(\mathbf{y}) > \lambda \mid H_1). \tag{4}$$


The false-alarm probability represents the incorrect decision that s(t) is present in the frequency band when 
it is actually not, and it may be written as (5). 


$$P_f = \Pr(T(\mathbf{y}) > \lambda \mid H_0). \tag{5}$$


The decision threshold is the crucial parameter in (3) and must be optimized for each detection 
technique to enhance its performance. In general, the decision threshold is chosen to make $P_d$ as large and $P_f$ 
as small as possible. The threshold is commonly set based on a constant false-alarm probability as (6): 


$$P_f = \Pr(T(\mathbf{y}) > \lambda \mid H_0), \tag{6}$$


where Q is the standard Gaussian complementary cumulative distribution function, noting that the decision 
threshold must be adjusted based on the variance of the additive noise. 
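
As a minimal sketch of the energy detector in (3) with a constant false-alarm threshold, the Python fragment below assumes circularly symmetric complex Gaussian noise with a known per-sample variance and a Gaussian (central-limit) approximation of the test statistic under the noise-only hypothesis, which gives a threshold of $\sigma_n^2 (N + \sqrt{N}\, Q^{-1}(P_f))$. This is one common textbook form and may differ from the exact expression intended in (6); the function names and the synthetic test signal are illustrative assumptions, and Python is used here only for illustration.

```python
import random
from statistics import NormalDist


def energy_statistic(samples):
    """Test statistic T(y): sum of squared magnitudes of the received samples."""
    return sum(abs(y) ** 2 for y in samples)


def cfar_threshold(n_samples, noise_var, p_fa):
    """Threshold for a target false-alarm probability (Gaussian approximation).

    Assumes complex Gaussian noise, so that T(y) under H0 has mean N * var
    and standard deviation sqrt(N) * var.
    """
    q_inv = NormalDist().inv_cdf(1.0 - p_fa)  # Q^{-1}(p_fa)
    return noise_var * (n_samples + (n_samples ** 0.5) * q_inv)


def detect(samples, noise_var, p_fa=0.01):
    """Return True (decide H1) if the energy exceeds the threshold, else False."""
    lam = cfar_threshold(len(samples), noise_var, p_fa)
    return energy_statistic(samples) > lam


def complex_noise(n, var, rng):
    """Complex Gaussian noise with total variance `var` per sample."""
    s = (var / 2.0) ** 0.5
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]


rng = random.Random(0)
N, VAR = 1000, 1.0
# Synthetic H1 example: a unit-amplitude carrier at baseband plus noise.
with_signal = [1.0 + w for w in complex_noise(N, VAR, rng)]
print(cfar_threshold(N, VAR, 0.01), detect(with_signal, VAR))
```

Because the threshold scales with the noise variance, an error in the variance estimate shifts the operating point directly, which is why the threshold has to be re-estimated for each band.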


2.2. Machine learning 
Machine learning algorithms learn a target function f that best maps input variables X to an output 
variable Y. This objective is expressed for a machine learning algorithm as (7). 


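$$Y = f(X), \tag{7}$$

with

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1N} \\ x_{21} & x_{22} & \cdots & x_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ x_{M1} & x_{M2} & \cdots & x_{MN} \end{bmatrix}, \tag{8}$$
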
where $M$ is the sample size and $N$ is the number of features for each observation. Each pair $(X, Y)$ 
is called a training sample because it is used to guide the learning algorithm in obtaining the predictor $f$. 
There are two classical learning problems, depending on the prediction type. If the outcome variable $Y$ is 
quantitative, the learning problem is a regression problem; if the output variable $Y$ is categorical, 
it is a classification problem. 

A classification problem is a kind of supervised machine learning task in which an algorithm learns 
to classify new observations from examples of an output variable. The classification efficiency of machine 
learning models depends greatly on the selection of the dataset representation or features used for training. In 
this paper, we use the CTR, DCA, NBC, KNN, and SVM algorithms for training and classifying datasets. 
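
To illustrate the mapping $Y = f(X)$ for a classification problem, the following minimal sketch implements a plain $k$-nearest-neighbours classifier in Python on a toy two-feature dataset. The feature values, labels, and function name are invented for illustration; this is not the authors' MATLAB model or data.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy training matrix X (M = 6 observations, N = 2 features) and labels Y.
X = [[0.9, 0.8], [1.0, 0.7], [0.8, 0.9],     # hypothetical "voice" rows
     [0.1, 0.2], [0.2, 0.1], [0.15, 0.15]]   # hypothetical "noise" rows
Y = ["voice", "voice", "voice", "noise", "noise", "noise"]

print(knn_predict(X, Y, [0.85, 0.75]))
print(knn_predict(X, Y, [0.05, 0.10]))
```

A nearest-neighbour model stores the training set instead of fitting parameters, so its cost is paid at classification time, when the distance to every stored row is evaluated.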

2.3. Demodulated waveform characteristics 

In this paper, we focus on the signals of amateur radio communication, which are based on 
frequency modulation (FM). The receiver's demodulated signal is a signal in the audible frequency band, i.e., a 
voice signal. The demodulated waveform characteristics vary depending on the nature of the speech or voice. The 
features used to characterize these signals are descriptive statistics and spectral measurements. 


2.3.1. Descriptive statistics 

Descriptive statistics are used to represent the basic features of a signal. They summarize the 
signal sample with measures such as the maximum element of the array (max), 
minimum element of the array (min), average or mean value (mean), median value 
(med), maximum-to-minimum difference (p2p), root-mean-square (RMS) level (rms), 
peak-magnitude-to-RMS ratio (p2rms), root-sum-of-squares level (rssq), standard deviation (std), and variance (var). 
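
The ten descriptive statistics can be sketched in a few lines of Python. The dictionary keys follow the abbreviations used in the text; the function name is hypothetical, and the use of the sample (n - 1) normalization for the standard deviation and variance is an assumption, since the text does not state which normalization is applied.

```python
import math
import statistics


def descriptive_features(x):
    """Return the descriptive statistics listed above for a real-valued signal."""
    n = len(x)
    peak = max(abs(v) for v in x)
    rssq = math.sqrt(sum(v * v for v in x))  # root-sum-of-squares level
    rms = rssq / math.sqrt(n)                # root-mean-square level
    return {
        "max": max(x),
        "min": min(x),
        "mean": statistics.fmean(x),
        "med": statistics.median(x),
        "p2p": max(x) - min(x),
        "rms": rms,
        "p2rms": peak / rms if rms > 0 else float("nan"),
        "rssq": rssq,
        "std": statistics.stdev(x),     # sample standard deviation (n - 1)
        "var": statistics.variance(x),  # sample variance (n - 1)
    }


features = descriptive_features([1.0, -2.0, 3.0, -4.0])
print(features)
```

Each demodulated channel segment would contribute one such dictionary, i.e. one row of the feature matrix $X$.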


2.3.2. Spectral measurements 

Spectral measurements represent the properties of a signal as a function of frequency. Each 
frequency component of the input signal is displayed as a signal level at that frequency within the 
band of interest, from which summary measures such as the mean frequency (meaf) and median frequency (medf) 
are derived. This paper uses both descriptive statistics and spectral measurement parameters as the 
classification data features. In the model training results (section 4.1), we demonstrate the feasibility 
and contribution of these features to the waveform characteristic classification. 
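
The two spectral features can be illustrated with a short Python sketch. It assumes the usual definitions: the mean frequency is the power-weighted average frequency, and the median frequency is the frequency at which the cumulative power reaches half of the total. A direct DFT is used to keep the example self-contained, without windowing or one-sided scaling; the sampling rate and test tone are illustrative assumptions rather than the authors' settings.

```python
import cmath
import math


def power_spectrum(x, fs):
    """Power spectrum of a real signal over 0..fs/2 via a direct DFT (O(N^2))."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(acc) ** 2)
    return freqs, power


def mean_frequency(freqs, power):
    """Power-weighted average frequency."""
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)


def median_frequency(freqs, power):
    """Lowest frequency bin at which the cumulative power reaches half the total."""
    half, running = 0.5 * sum(power), 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= half:
            return f
    return freqs[-1]


fs = 8000.0  # assumed audio sampling rate in Hz
tone = [math.sin(2 * math.pi * 1000.0 * t / fs) for t in range(64)]
f, p = power_spectrum(tone, fs)
print(mean_frequency(f, p), median_frequency(f, p))
```

For a pure tone both measures coincide with the tone frequency, whereas for a broadband noise waveform they move toward the middle of the analysed band, which is what makes them useful as discriminating features.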


3. PROPOSED EDWC ALGORITHM 
The processing pipeline of the proposed EDWC algorithm framework is shown in Figure 1. The 
pipeline consists of data acquisition, data preprocessing, model development, and classification and decision steps. 

Figure 1. Processing pipeline of the energy detection and waveform feature classification (EDWC) algorithm 


3.1. Data acquisition 

In the present work, the performance of the proposed EDWC algorithm is validated using a 
combination of an Avnet ZedBoard with the Analog Devices AD-FMCOMMS3-EBZ FMC module. Table 1 
presents the hardware specifications. The proposed algorithms are implemented 
with MATLAB R2019 on a 64-bit computer with a Core i5 processor and 4 GB of RAM. 


Table 1. Hardware specifications 


Parameter                      Value 
RF transceiver                 2 x Tx and 2 x Rx 
Frequency range                70 MHz to 6.0 GHz 
Channel bandwidth              <200 kHz to 56 MHz 
RF inputs (peak power)         2.5 dBm 
Operating temperature range    -40 °C to +85 °C 


Figure 2 shows the experimental setup, where FMCOMMS3 and ZedBoard interface with the 
system through MATLAB software. The antenna AOR DAG735G is connected to the Rx port of the 
FMCOMMS3 board and can cover a frequency range of 75 MHz to 3 GHz. The receiving antenna is located 
at 13.767756°N, 100.530569°E, and the height is approximately 20 meters above the ground. 


Figure 2. Experimental setup 


For our training dataset, the experimental setup records 30000 RF signals every ten seconds with a 
specific carrier frequency. We use real broadcasting FM radio signals to train the developed model to classify 
and distinguish waveform characteristics. We use another 30000 RF signal datasets to test the performance of 
our developed machine learning algorithms. 

For application purposes and for planning the use of the public spectrum, we implemented the 
developed framework to maintain a one-week cycle usage statistic for FM amateur radio. The available 
frequency bands for FM amateur radio according to [31] are divided into four sections as follows: Band 1 
between 144.5125 MHz and 144.9875 MHz, Band 2 between 145.1375 MHz and 145.5375 MHz, Band 3 
between 146.2875 MHz and 146.6000 MHz, and Band 4 between 146.8125 MHz and 147.0000 MHz. Each 
channel has a bandwidth of 12.5 kHz. 
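
The channel plan implied by these band edges and the 12.5 kHz channel bandwidth can be enumerated as follows. This sketch assumes that the stated band edges are the centre frequencies of the first and last channels and that the channel spacing equals the channel bandwidth; neither is stated explicitly in the text, and the names are illustrative.

```python
CHANNEL_SPACING_HZ = 12_500

# Band edges from the text, in Hz (integers avoid floating-point drift).
BANDS_HZ = {
    1: (144_512_500, 144_987_500),
    2: (145_137_500, 145_537_500),
    3: (146_287_500, 146_600_000),
    4: (146_812_500, 147_000_000),
}


def channel_centres(start_hz, stop_hz, spacing_hz=CHANNEL_SPACING_HZ):
    """Centre frequencies from start to stop inclusive (assumed interpretation)."""
    return list(range(start_hz, stop_hz + 1, spacing_hz))


channels = {band: channel_centres(lo, hi) for band, (lo, hi) in BANDS_HZ.items()}
for band, centres in channels.items():
    print(band, len(centres), centres[0] / 1e6, centres[-1] / 1e6)
```

Each centre frequency in such a list corresponds to one pass of the per-channel loop in the detection and classification steps.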

Figure 3(a) and (b) illustrate examples of the instantaneous spectrum and the spectrogram of the real 
FM radio signal, respectively, versus frequency. As shown in Figure 3(a), the spectrum of the RF signal 
varies depending on the modulated voice signal. Figure 3(b) presents the spectrogram of the same RF signal 
with a time history of 100 ms. 


3.2. Data preprocessing 

The analogue RF signals at the specified frequency range are converted to the intermediate 
frequency (IF) and stored for classification processing. The potential predictor variables used in this study are 
the descriptive statistics and spectral measurements of the FM demodulated signal of each channel, as 
described in section 2.3. Figure 4 shows a comparison of the waveform and amplitude obtained from the 
demodulation processing of one dataset. The diagram clearly shows the waveform characteristics of each 
signal type. The solid line represents the waveform of the voice or speech signal of FM radio broadcasting. 
The dashed line depicts the waveform of the noise signal. The waveform characteristics are different, and we 
can use the waveform properties in each dataset as variables in processing the waveform relationships and 
signal types. 


Figure 3. Example from the RF signal dataset; (a) spectrum of RF signal, (b) spectrogram of RF signal 


Figure 4. Example of demodulated waveform 


3.3. Model development 
The purpose of machine learning is to develop a model that makes classifications based on input 
data or features. A supervised learning algorithm takes a labeled set of input data and known corresponding 
outputs and trains a model to produce reasonable classifications in response to new data, as described in 
Algorithm 1. The learning process begins with an input data matrix $X$. Each row of $X$ represents one 
observation or measurement. Each column of $X$ denotes one feature or predictor. After model fitting, we 
obtain one model per algorithm. These models are then used to classify the output. In this 
case, we have two categories: voice or speech waveforms and noise waveforms. 


3.4. Classification and decision 
A measure of the energy level indicates whether a signal is transmitted in that frequency band or not. The 
application of classic ED techniques can provide only the $H_0$ or $H_1$ status, as presented in (1) and (2). However, in 
practical applications with radio amateurs, there is also a form of noise transmission in which a signal is sent 
out with no audio or speech content, for example, when the user presses and holds the transmit key. This method of 
analysis, therefore, further classifies the form and nature of the measured signal. The developed EDWC 
algorithm will be beneficial in further applications for security agencies. 

The process of classifying and making decisions is a combination of the capabilities of ED and the 
analysis of voice signals using machine learning algorithms, as described in algorithm 2. According to the 
preprogrammed processing steps, the developed board captures the RF signal in real time. Then, it filters the 
wideband signal to the subband according to the respective channel and bandwidth size. Next, all descriptive 
statistics and spectral measurements are calculated to prepare the input row of X. 


Algorithm 1. Model development 
Input: Wideband RF sample data 
Output: Classification models (CTR, DCA, NBC, KNN, SVM) 
Initialization: Training dataset acquisition 
Loop Process: 
    for i = 1 to number of channels do 
        Frequency band selection using bandpass filter 
        Calculate features of each frequency band 
        Preprocessing input data matrix X 
        for n = 1 to number of machine learning models do 
            Train models 
        end for 
        Save model 
    end for 
Test performance of each model 


Algorithm 2. Classification and decision 
Input: Wideband RF sample data 
Output: Decision result 
Initialization: 
    Test data acquisition y(t) 
    Threshold estimation λ 
    Load classification models (CTR, DCA, NBC, KNN, SVM) 
Loop Process: 
    for i = 1 to number of channels do 
        Frequency band selection using bandpass filter 
        Calculate features of each frequency band 
        Preprocessing input data matrix X 
        for n = 1 to number of machine learning models do 
            Energy detection T(y) 
            Waveform classification (WC) 
            Decision based on EDWC algorithm 
            if (T(y) < λ) and (WC == Noise) then 
                Decision case C0 
            else if (T(y) ≷ λ) and (WC == Voice) then 
                Decision case C1 
            else if (T(y) > λ) and (WC == Noise) then 
                Decision case C2 
            end if 
        end for 
    end for 
Count classification and decision results 


As mentioned above, the energy detection method only indicates whether there is a signal in the 
observed frequency channel or not. Once some signal power is detected, a further analysis step 
classifies it as a speech signal or noise. The decision output is divided into three 
cases, $C_0$, $C_1$, and $C_2$. 

The classification models process the input data and classify the waveform features into two groups: 
(WC = Voice) and (WC = Noise). The ED module compares the energy level with the predefined threshold 
and gives the comparison results: $H_0$ or $H_1$. In the decision step, we define the decision output based on 
waveform classification and ED as follows: 

— $C_0$ when ($T(\mathbf{y}) < \lambda$, $H_0$) and (WC = Noise): In this case, the signal level is weaker than the reference 
threshold, and the resulting waveform characteristics are generally similar to those of a noise signal. 

— $C_1$ when ($T(\mathbf{y}) \gtrless \lambda$, $H_0$ or $H_1$) and (WC = Voice): Suppose the measured energy level is smaller than the 
specified threshold level, but the waveform characteristics are similar to voice signals. In this case, the 
decision algorithm will still classify the detected signal into the voice group. There is a possibility that the 
transmitter is at a great distance, causing the signal intensity to decrease. However, the waveform 
characteristics indicate that it may be a voice signal employed for real communication. 

— $C_2$ when ($T(\mathbf{y}) > \lambda$, $H_1$) and (WC = Noise): In public amateur radio use, there may be accidental or 
intentional interference by the user. Alternatively, the user may transmit a carrier wave signal without 
modulation with a speech signal. In this research, the decision-making model was designed to take this 
actual situation into account. In other words, the signal level may be greater than the threshold due to the 
transmitted carrier frequency. However, the waveform does not have the characteristics of a voice signal 
as defined in the machine learning model. 
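
The three decision cases can be sketched as a small Python function. Following the description of the voice case, a waveform classified as voice is assigned to that case regardless of the energy level; the string labels and the handling of an energy exactly equal to the threshold (assigned here to the weak-noise case) are assumptions made for illustration.

```python
from collections import Counter


def edwc_decision(energy, threshold, waveform_class):
    """Map the ED and waveform-classification outputs to 'C0', 'C1', or 'C2'."""
    if waveform_class == "voice":
        # Voice is accepted even below the threshold (e.g. a distant transmitter).
        return "C1"
    if waveform_class != "noise":
        raise ValueError(f"unknown waveform class: {waveform_class!r}")
    # Noise waveform: strong energy suggests an unmodulated carrier or interference.
    return "C2" if energy > threshold else "C0"


# Hypothetical (energy, waveform class) records for one channel.
records = [(0.2, "noise"), (0.4, "voice"), (1.7, "voice"), (2.5, "noise")]
counts = Counter(edwc_decision(e, 1.0, wc) for e, wc in records)
print(counts)
```

Counting the returned labels per channel and per classifier yields the kind of usage statistics reported for the real-time observation.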


4. RESULTS AND DISCUSSION 

In this section, we conduct extensive simulations to verify the performance of the proposed EDWC 
algorithm. In particular, we evaluate the training performance of the classification scheme in section 4.1. 
Then, we demonstrate the testing performance and the detection probability of the different algorithms in 
section 4.2. Finally, we assess the performance of real-time observation applications using real amateur 
public radio in section 4.3. 


4.1. Training dataset 
4.1.1. Correlation coefficient of features 

From Figure 5, we find a significant correlation between individual waveform 
characteristics. Most of the correlation coefficients of the selected features are higher than 0.3; i.e., there is a 
robust correlation. Therefore, using the waveform properties as variables in machine learning processing can 
lead to reliable and practical results. 
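
A heat map of this kind is built from pairwise correlation coefficients between feature columns. The sketch below assumes the Pearson coefficient, which the text does not name explicitly, and uses invented feature values purely to show the computation.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(columns):
    """Pairwise correlations for a dict mapping feature name -> list of values."""
    names = list(columns)
    return {(p, q): pearson(columns[p], columns[q]) for p in names for q in names}


# Hypothetical feature columns (one value per observation).
toy = {
    "rms": [0.10, 0.35, 0.50, 0.80],
    "p2p": [0.40, 1.10, 1.70, 2.60],
    "medf": [900.0, 700.0, 650.0, 400.0],
}
corr = correlation_matrix(toy)
for pair, value in corr.items():
    print(pair, round(value, 3))
```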


4.1.2. Training duration of different algorithms 

The average training durations for the different classifiers according to the size of the training 
feature vectors are displayed in Table 2. The nearest neighbor algorithm has the longest 
training duration (5.0926 seconds for 30000 samples) among all the machine learning algorithms. The 
algorithm that used the least time to train the dataset in this experiment was discriminant analysis, with 
0.3026 seconds for 30000 samples. 

Figure 5. Heat map of the interrelated features 


Table 2. Average training duration for different machine learning algorithms [seconds] 

Algorithm    Number of training samples 
             6000      10800     15200     20400     25200     30000 
CTR          0.1276    0.1594    0.1899    0.2218    0.3450    0.4387 
DCA          0.1404    0.1679    0.2111    0.2385    0.2770    0.3026 
NBC          0.1605    0.1999    0.2434    0.2636    0.3021    0.3407 
KNN          0.3181    0.7964    1.5140    2.4617    3.6306    5.0926 
SVM          0.4886    1.2922    2.0565    2.1238    2.7555    3.4670 


4.2. Test dataset 
4.2.1. Classification time of different algorithms 

Table 3 presents the time needed to classify the waveform characteristics for different 
classifiers based on 30000 test samples. The different numbers of samples used in the estimation process are 
presented in the “number of classification samples” columns, from 6000 to 30000 datasets. In the processing 
used to classify the signal waveform, the proposed EDWC algorithm with decision trees obtains the shortest 
classification time (0.0125 seconds for 30000 samples), followed by the naive Bayes algorithm 
(0.0168 seconds for 30000 samples) and discriminant analysis (0.0196 seconds for 30000 samples). They 
also have comparable accuracy rates. Table 3 shows that the proposed EDWC algorithm using an SVM 
obtains the highest accuracy of 83.6685%; the other algorithms also show a relatively good performance of 
approximately 83.6%. 


Table 3. Accuracy and average classification time for different machine learning algorithms [seconds] 

Algorithm    Accuracy (%)    Number of classification samples 
                             6000      10800     15200     20400     25200     30000 
CTR          83.6845         0.0060    0.0076    0.0093    0.0103    0.0116    0.0125 
DCA          83.6079         0.0067    0.0102    0.0126    0.0155    0.0172    0.0196 
NBC          83.6653         0.0072    0.0106    0.0126    0.0138    0.0156    0.0168 
KNN          83.6238         0.6377    1.1434    1.6447    2.1472    2.6516    3.1536 
SVM          83.6685         0.0091    0.0129    0.0161    0.0196    0.0230    0.0260 


4.2.2. Detection probability of different algorithms 

The ROC curve is a metric adopted to examine the properties of classifiers. Figure 6 compares the 
performance of the individual EDWC schemes in terms of their ROC curves. The true positive rate 
(TPR), on the y-axis, is the fraction of actual positive samples that the classifier correctly predicts as positive. 
The x-axis represents the false positive rate (FPR), which is the fraction of actual negative samples that are 
incorrectly predicted as positive. From the comparison of the curves, we can see that the KNN classifier has the highest 
prediction efficiency, followed by the CTR and NBC classification algorithms. However, the difference is 
not very great. These results indicate that combining descriptive statistics and spectral measurements in model 
development can have a significant effect on waveform classification. 
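
As a sketch of how ROC points are obtained, the following Python code sweeps the decision threshold over classifier scores and records the (FPR, TPR) pair at each step, then integrates the curve with the trapezoidal rule. The scores and labels are invented for illustration, with label 1 standing for the positive class (for example, voice); the inputs must contain at least one sample of each class.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by lowering the decision threshold.

    labels: 1 for the positive class, 0 for the negative class.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda sl: -sl[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after all samples with the same score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (FPR, TPR) points sorted by FPR."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # hypothetical classifier scores
labels = [1, 1, 0, 1, 0, 0]              # hypothetical ground truth
pts = roc_points(scores, labels)
print(pts, area_under_curve(pts))
```

A curve closer to the top-left corner, and hence a larger area, indicates a classifier that separates the two waveform classes better across all thresholds.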

Figure 6. Receiver operating characteristic curve of the proposed classifiers 


4.3. Real-time observation 
In the real application experiment, the experimental setup was put in place and captured the RF 
signals of public amateur radio for a week (11-17 October 2020) in a particular band. 


4.3.1. Observed signal level 

Figure 7 presents the comparison plots of the energy level of each frequency band. The x-axis 
indicates the number of samples, and the y-axis represents the size of the normalized upper envelope of each 
signal sample. Each frequency band has a different energy level for each captured RF signal over time, which 
shows how the usage of the signals varies in the observation period. These levels are also essential for 
determining the threshold level and the level of noise that occurs in each frequency range. From the 
comparison of the graphs, we can see that band 1 is used the most, and the least 
active frequency range is band 4. 


Figure 7. Normalized upper envelope of each signal sample (x-axis: number of samples; y-axis: normalized upper envelope) 


4.3.2. Counting and decision making 


Table 4 shows the results obtained by processing the actual public amateur signals 
with the developed EDWC algorithm. The results are divided into four main groups according to the 
frequency range of the detected signal and the machine learning algorithm used to classify the waveform 
characteristics. In addition, the display is divided into five categories: $H_0$, $H_1$, $C_0$, $C_1$, and $C_2$. 


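Table 4. Counting and decision making for real-time observations 

                      Category 
Band   Algorithm      H0        H1       C0        C1       C2 
1      CTR            310572    26388    310572    25303    1085 
1      DCA            310572    26388    310572    3868     22520 
1      NBC            310572    26388    310572    4171     22217 
1      KNN            310572    26388    310572    2821     23567 
1      SVM            310572    26388    310572    102      26286 
2      CTR            250129    34991    250129    32020    2971 
2      DCA            250129    34991    250129    2394     32597 
2      NBC            250129    34991    250129    3572     31419 
2      KNN            250129    34991    250129    1841     33150 
2      SVM            250129    34991    250129    6        34985 
3      CTR            193026    31614    193026    31258    356 
3      DCA            193026    31614    193026    8720     22894 
3      NBC            193026    31614    193026    8484     23130 
3      KNN            193026    31614    193026    8382     23232 
3      SVM            193026    31614    193026    8        31606 
4      CTR            131105    7135     131105    7103     32 
4      DCA            131105    7135     131105    174      6961 
4      NBC            131105    7135     131105    188      6947 
4      KNN            131105    7135     131105    174      6961 
4      SVM            131105    7135     131105    2        7133 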

In the case of $H_0$ and $H_1$, we focus primarily on the level of energy, and we can see that the numbers of 
records above the threshold ($H_1$) are 26388, 34991, 31614, and 7135 in band 1, band 2, band 3, and band 
4, respectively. In frequency band 1, for example, the signals that are higher than the threshold level and 
are classified as voice waveforms are presented in column $C_1$. With discriminant analysis, naive Bayes, and 
k-nearest neighbors, the numbers of signals in this group are of the same order: 3868, 4171, and 2821, 
respectively. However, the analysis using decision trees and support vector machines produced very different 
results (25303 and 102, respectively). The pattern is similar across all four frequency bands. The overall results show 
that the proposed EDWC algorithm can be used in practical applications, especially in measuring the usage rate of each 
frequency band, including the number of times in which a signal is emitted with disturbance, as in case $C_2$. 
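
The counts in Table 4 can be turned into simple usage figures with a few lines of Python. The sketch below transcribes the table and computes, for each band, the fraction of records above the energy threshold and, for each classifier, the share of those records classified as voice. Both ratios are illustrative definitions introduced here, not metrics defined in the paper.

```python
# Counts transcribed from Table 4: band -> (H0, H1, {algorithm: (C1, C2)}).
TABLE_4 = {
    1: (310572, 26388, {"CTR": (25303, 1085), "DCA": (3868, 22520),
                        "NBC": (4171, 22217), "KNN": (2821, 23567),
                        "SVM": (102, 26286)}),
    2: (250129, 34991, {"CTR": (32020, 2971), "DCA": (2394, 32597),
                        "NBC": (3572, 31419), "KNN": (1841, 33150),
                        "SVM": (6, 34985)}),
    3: (193026, 31614, {"CTR": (31258, 356), "DCA": (8720, 22894),
                        "NBC": (8484, 23130), "KNN": (8382, 23232),
                        "SVM": (8, 31606)}),
    4: (131105, 7135, {"CTR": (7103, 32), "DCA": (174, 6961),
                       "NBC": (188, 6947), "KNN": (174, 6961),
                       "SVM": (2, 7133)}),
}


def occupancy(h0, h1):
    """Fraction of records whose energy exceeded the threshold (assumed metric)."""
    return h1 / (h0 + h1)


def voice_share(c1, c2):
    """Fraction of above-threshold records classified as voice (assumed metric)."""
    return c1 / (c1 + c2)


for band, (h0, h1, cases) in TABLE_4.items():
    # In the reported counts, C1 + C2 equals H1 for every algorithm.
    consistent = all(c1 + c2 == h1 for c1, c2 in cases.values())
    shares = {alg: round(voice_share(c1, c2), 3) for alg, (c1, c2) in cases.items()}
    print(band, round(occupancy(h0, h1), 4), consistent, shares)
```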


5. CONCLUSION 

In this paper, we propose a novel energy detection and waveform feature classification (EDWC) 
algorithm to allow the detection of speech signals in public frequency bands based on energy detection 
techniques and supervised machine learning workflows. To further promote distributed decision making, we 
develop a waveform decision scheme for classifying voice signals and noise signals after the demodulation 
process by applying descriptive statistics and spectral measurements. We use supervised classifiers such as 
decision trees, discriminant analysis, naive Bayes, k-nearest neighbors, and support vector machines. The 
received energy level and demodulated waveform characteristics are considered as a feature vector for 
classifying the input signal. We evaluate the performance of the proposed EDWC algorithm in terms of the 
average training duration, classification time, and receiver operating characteristic curves. Simulation and 
experimental results using real FM broadcast radio signals demonstrate that the application of waveform 
properties as predictor parameters in machine learning algorithms improves the capability of waveform 
classification. Meanwhile, the EDWC schemes using discriminant analysis, a naive Bayes classifier, and 
k-nearest neighbors deliver similar decision outcomes in real-time public RF signal detection and 
classification. Our proposed EDWC framework can work efficiently and can also distinguish and classify 
signals. It shows the actual usage rate of each frequency band as well as the number of times a signal is 
generated with disturbance, which is an indispensable tool for analyzing data and monitoring the public 
spectrum usage of governments and related agencies. 